Supervised feature selection in graphs with path coding penalties and network flows

نویسندگان

  • Julien Mairal
  • Bin Yu
چکیده

We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a subgraph with few connected components; by exploiting prior knowledge, one can indeed improve the prediction performance or obtain results that are easier to interpret. Regularization or penalty functions for selecting features in graphs have recently been proposed, but they raise new algorithmic challenges. For example, they typically require solving a combinatorially hard selection problem among all connected subgraphs. In this paper, we propose computationally feasible strategies to select a sparse and well-connected subset of features sitting on a directed acyclic graph (DAG). We introduce structured sparsity penalties over paths on a DAG called “path coding” penalties. Unlike existing regularization functions that model long-range interactions between features in a graph, path coding penalties are tractable. The penalties and their proximal operators involve path selection problems, which we efficiently solve by leveraging network flow optimization. We experimentally show on synthetic, image, and genomic data that our approach is scalable and leads to more connected subgraphs than other regularization functions for graphs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Path Coding Penalties for Directed Acyclic Graphs

We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a subgraph which has a small number of connected components, either to improve the prediction performance, or to obtain better interpretable results. Existing regularization or penalty functions for this ...

متن کامل

Machine learning based Visual Evoked Potential (VEP) Signals Recognition

Introduction: Visual evoked potentials contain certain diagnostic information which have proved to be of importance in the visual systems functional integrity. Due to substantial decrease of amplitude in extra macular stimulation in commonly used pattern VEPs, differentiating normal and abnormal signals can prove to be quite an obstacle. Due to developments of use of machine l...

متن کامل

Optimal Coding Subgraph Selection under Survivability Constraint

Nowadays communication networks have become an essential and inevitable part of human life. Hence, there is an ever-increasing need for expanding bandwidth, decreasing delay and data transfer costs. These needs necessitate the efficient use of network facilities. Network coding is a new paradigm that allows the intermediate nodes in a network to create new packets by combining the packets recei...

متن کامل

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...

متن کامل

Supervised Learning Real-time Traffic Classifiers

Network traffic classification plays an important role in various network activities. Due to the ineffectiveness of traditional port-based and payload-based methods, recent works proposed using machine learning methods to classify flows based on statistical characteristics. In this study, we present a comprehensive evaluation of the effectiveness of these statistical methods for real-time traff...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 14  شماره 

صفحات  -

تاریخ انتشار 2013